Approximate String Matching Using a Bidirectional Index
Abstract
We study strategies for approximate pattern matching that exploit bidirectional text indexes, extending and generalizing ideas of [6]. We introduce a formalism, called search schemes, to specify search strategies of this type; develop a probabilistic measure for the efficiency of a search scheme; prove several combinatorial results on efficient search schemes; and, finally, provide experimental computations supporting the superiority of our strategies.
Similar resources
FAMOUS: Fast Approximate string Matching using OptimUm search Schemes
Finding approximate occurrences of a pattern in a text using a full-text index is a central problem in bioinformatics and has been extensively researched. The introduction of practical bidirectional indices has opened new possibilities for solving the problem as they allow the search to be started from anywhere within the pattern and extended in both directions. In particular, use of search sch...
n-Gram/2L-approximation: a two-level n-gram inverted index structure for approximate string matching
Approximate string matching is to find all the occurrences of a query string in a text database allowing a specified number of errors. Approximate string matching based on the n-gram inverted index (simply, n-gram Matching) has been widely used. A major reason is that it is scalable for large databases since it is not a main memory algorithm. Nevertheless, n-gram Matching also has drawbacks: th...
Faster Filters for Approximate String Matching
We introduce a new filtering method for approximate string matching called the suffix filter. It has some similarity with well-known filtration algorithms, which we call factor filters, and which are among the best practical algorithms for approximate string matching using a text index. Suffix filters are stronger, i.e., produce fewer false matches than factor filters. We demonstrate experiment...
A New Indexing Method for Approximate String Matching
We present a new indexing method for the approximate string matching problem. The method is based on a suffix tree combined with a partitioning of the pattern. We analyze the resulting algorithm and show that the retrieval time is $O(n^{\lambda})$, for $0 < \lambda < 1$, whenever $\alpha < 1 - e/\sqrt{\sigma}$, where $\alpha$ is the error level tolerated and $\sigma$ is the alphabet size. We experimentally show that this index outperforms by far all othe...
Approximate String Matching
We present a new indexing method for the approximate string matching problem. The method is based on a suffix tree combined with a partitioning of the pattern. We analyze the resulting algorithm and show that the retrieval time is $O(n^{\lambda})$, for $0 < \lambda < 1$, whenever $\alpha < 1 - e/\sqrt{\sigma}$, where $\alpha$ is the error level tolerated and $\sigma$ is the alphabet size. We experimentally show that this index outperforms by far all othe...
Journal title:
- Theor. Comput. Sci.
Volume 638
Publication date: 2014